DETAILED ACTION

1. This action is in response to application 10/695,125, filed October 28, 2003.
Claims 1-24 are pending in the application and have been examined. 

Drawings 

2. The drawings are objected to because they are hard to read and understand. 
Cleaner drawings should be submitted. Corrected drawing sheets in compliance with 
37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the 
application. Any amended replacement drawing sheet should include all of the figures 
appearing on the immediate prior version of the sheet, even if only one figure is being 
amended. The figure or figure number of an amended drawing should not be labeled as 
"amended." If a drawing figure is to be canceled, the appropriate figure must be 
removed from the replacement sheet, and where necessary, the remaining figures must 
be renumbered and appropriate changes made to the brief description of the several 
views of the drawings for consistency. Additional replacement sheets may be necessary 
to show the renumbering of the remaining figures. Each drawing sheet submitted after 
the filing date of an application must be labeled in the top margin as either 
"Replacement Sheet" or "New Sheet" pursuant to 37 CFR 1.121(d). If the changes are 
not accepted by the examiner, the applicant will be notified and informed of any required 
corrective action in the next Office action. The objection to the drawings will not be held 
in abeyance. 

Claim Rejections - 35 USC § 112

3. The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

4. Claims 5 and 6 are rejected under 35 U.S.C. 112, second paragraph, as being
indefinite for failing to particularly point out and distinctly claim the subject matter which
applicant regards as the invention.

5. Claims 5 and 6 recite the limitation "said transmitting" with reference to claim 1. There is
insufficient antecedent basis for this limitation in the claims, because claim 1 does not recite
transmitting. However, for the purposes of examination, claims 5 and 6 will be assumed
to depend from claim 8. Claims 5 and 6 should also be renumbered to fall after
claim 8.

Claim Rejections - 35 USC § 102 

6. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

7. Claims 1, 3, 4, 10, 11, 13, 14, and 16 are rejected under 35 U.S.C. 102(b) as 
being anticipated by Saunders (Real-Time Discrimination of Broadcast Speech/Music). 

8. Consider claim 1, Saunders teaches a method for classifying an audio signal (we
describe a technique which is successful at discriminating speech from music; page
993, column 1, line 1), the method comprising:

receiving an audio signal to be classified (this is a technique for discriminating
speech from music from an FM broadcast; page 993, column 1, line 2);

analyzing selected audio signal components (The first step is to measure the 
ZCR of the signal over a 2.4 second segment of the data; page 994, column 2, line 43); 

recording a result of analysis of the selected audio signal components (recording
the result would be inherent, since the result must be retained in order to compare it);

comparing the recorded result of analysis to a threshold value (If this statistic 
exceeds a specific threshold, the distribution outside these bounds is significantly 
skewed and the waveform is likely speech; page 994, column 2, line 43); and 

classifying the audio signal based upon comparison of the recorded result of 
analysis and the threshold value (If this statistic exceeds a specific threshold, the 
distribution outside these bounds is significantly skewed and the waveform is likely 
speech; page 994, column 2, line 43). 
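
For illustration only, the analysis, recording, comparison, and classification steps mapped above may be sketched as follows. The sketch is not taken from Saunders or from the application. The function names, the sample values, and the threshold are hypothetical, and comparing the raw zero crossing count is a simplifying assumption: the passage quoted from Saunders compares a statistic of the ZCR distribution rather than the raw count.

```python
def zero_crossing_count(samples):
    """Count sign changes between consecutive samples (zero is treated as non-negative)."""
    count = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def classify(samples, threshold):
    """Record the count, compare it to a threshold, and return a label.

    Assumption: a value above the threshold is labeled "speech", following the
    direction of the quoted Saunders passage; the threshold itself is hypothetical.
    """
    recorded = zero_crossing_count(samples)  # the "recorded result of analysis"
    label = "speech" if recorded > threshold else "music"
    return label, recorded


if __name__ == "__main__":
    print(classify([1, -1, 1, -1, 1], threshold=2))
    print(classify([1, 2, 3, -1, -2], threshold=2))
```

The direction of the comparison is a design choice in this sketch; reversing it gives the rule recited in claims 2 and 17.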

9. Consider claim 3, Saunders teaches the method according to claim 1, wherein
analyzing the selected audio signal components comprises counting zero point 
transitions of the selected audio signal components (The first step is to measure the 
ZCR of the signal over a 2.4 second segment of the data; page 994, column 2, line 43. 
Measuring the Zero Crossing Rate would entail counting the number of zero crossings). 

10. Consider claim 4, Saunders teaches the method according to claim 1, wherein
recording a result of analysis of the selected audio signal components comprises 
recording a count value of a number of zero point transitions of the selected audio 
signal components (The first step is to measure the ZCR of the signal over a 2.4 second 
segment of the data; page 994, column 2, line 43. Measuring the Zero Crossing Rate 
would entail counting the number of zero crossings. This number would inherently have 
to be stored somewhere in order to process it or manipulate it). 

11. Consider claim 10, Saunders teaches the method according to claim 1, wherein
classifying the audio signal occurs at a receiving end of an audio transmission system
(this is a technique for discriminating speech from music from an FM broadcast; page
993, column 1, line 2).

12. Consider claim 11, Saunders teaches the method according to claim 1, wherein
the audio signal is one of an analog signal and a digital signal (A sample rate of 16 kHz
was chosen for this discrimination technique; page 995, column 1, line 1. It is well
understood that a signal that is sampled is being converted to a digital signal. Saunders
also states that this is a technique for discriminating speech from music from an FM
broadcast; page 993, column 1, line 2. This further indicates that the signal started out
as an analog signal, as at the time of the publication of Saunders all FM broadcasts
were analog).

13. Consider claim 13, Saunders teaches the method according to claim 1, wherein
the threshold value used in the comparison is determined through trial and error of a
plurality of iterations in a comparing device (Data was collected manually by listening,
collecting and storing features, and labeling the segment. A variety of content was
processed, including talk, commercials, and many types of music. Once the classifier
was trained, the parameters were stored and fed into the real-time feature
extraction/classifier routine; page 995, column 1, line 33).

14. Consider claim 14, Saunders teaches the method according to claim 1, wherein
analyzing selected audio signal components comprises counting zero point transitions 
of the audio signal for a predetermined period of time (The first step is to measure the 
ZCR of the signal over a 2.4 second segment of the data; page 994, column 2, line 43. 
Measuring the Zero Crossing Rate would entail counting the number of zero crossings). 

15. Consider claim 16, Saunders teaches an apparatus for classifying an audio
signal (The experimental setup used a Gradient A/D unit attached to a workstation;
page 995, column 1, line 38), the apparatus comprising:

a zero point counter for counting and recording zero point transitions 
encountered in analysis of the selected audio signal components (The first step is to 
measure the ZCR of the signal over a 2.4 second segment of the data; page 994, 
column 2, line 43); and 

a comparator for comparing a recorded result of analysis to a threshold value
and classifying the audio signal based upon comparison of the recorded result of 
analysis and the threshold value (If this statistic exceeds a specific threshold, the 
distribution outside these bounds is significantly skewed and the waveform is likely 
speech; page 994, column 2, line 43). 



Claim Rejections - 35 USC § 103 

16. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

17. The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148
USPQ 459 (1966), that are applied for establishing a background for determining 
obviousness under 35 U.S.C. 103(a) are summarized as follows: 

1. Determining the scope and contents of the prior art.

2. Ascertaining the differences between the prior art and the claims at issue. 

3. Resolving the level of ordinary skill in the pertinent art. 

4. Considering objective evidence present in the application indicating 
obviousness or nonobviousness. 



18. Claims 2 and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Saunders in view of Carey (A Comparison of Features for Speech, Music 
Discrimination). 

19. Consider claim 2, Saunders teaches the method according to claim 1, but does
not specifically teach wherein classifying the audio signal based upon comparison of the 
recorded result of analysis and the threshold value further comprises: 

if the recorded result of analysis is greater than the threshold value, then the 
audio signal is determined to be music; and 

if the recorded result of analysis is less than the threshold value, then the audio 
signal is determined to be speech. 

In the same field of speech/music discrimination, Carey teaches if the recorded
result of analysis is greater than the threshold value, then the audio signal is determined
to be music (table 1 shows that the mean value of number of zero crossing (u) for music
0.18 is greater than that of speech 0.17); and

if the recorded result of analysis is less than the threshold value, then the audio
signal is determined to be speech (table 1 shows the mean value of zero crossing for
speech 0.17 was less than that of music 0.18).

Although Saunders uses a slightly different zero crossing analysis method than 
does Carey, it would have been obvious to one of ordinary skill in the art at the time of 
the invention to use the parameters of Carey as this method would be computationally 
inexpensive (Carey page 151, column 2, section 4.4). 

20. Consider claim 17, Saunders teaches the apparatus according to claim 16, but 
does not specifically teach wherein classifying the audio signal based upon comparison 
of the recorded result of analysis and the threshold value in the comparator further
comprises: 

if the recorded result of analysis is greater than the threshold value, then the 
audio signal is determined to be music; and 

if the recorded result of analysis is less than the threshold value, then the audio 
signal is determined to be speech. 

In the same field of speech/music discrimination, Carey teaches if the recorded 
result of analysis is greater than the threshold value, then the audio signal is determined 
to be music (table 1 shows that the mean value of number of zero crossing (u) for music 
0.18 is greater than that of speech 0.17); and 

if the recorded result of analysis is less than the threshold value, then the audio 
signal is determined to be speech (table 1 shows the mean value of zero crossing for
speech 0.17 was less than that of music 0.18).

Although Saunders uses a slightly different zero crossing analysis method than 
does Carey, it would have been obvious to one of ordinary skill in the art at the time of 
the invention to use the parameters of Carey as this method would be computationally 
inexpensive (Carey page 151, column 2, section 4.4).

21. Claims 7, 9, 15, and 20-24 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Saunders in view of Benyassine (US Patent 6,694,293). 

22. Consider claim 7, Saunders teaches the method according to claim 1, but does
not specifically teach wherein classifying the audio signal further comprises turning on a 
flag in a header of a packet of digital audio information, wherein the flag provides an 
indication of classification of the audio signal based upon comparison of the recorded 
result of analysis and the threshold value. 

In the same field of music and speech discrimination Benyassine teaches turning 
on a flag in a header of a packet of digital audio information (all flags used to mark 
audio frames are shown in Table 1, column 9), wherein the flag provides an indication of
classification of the audio signal based upon comparison of the recorded result of 
analysis and the threshold value (The music detection flag F M is set if either threshold 
for music conditions are met; column 7 line 37). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of invention to include a detection flag as taught by Benyassine with the speech and 
music discrimination method of Saunders in order to provide a method to pass the 
classification from one device to another. 

23. Consider claim 9, Saunders teaches the method according to claim 1, but does
not teach specifically wherein classifying the audio signal occurs at a transmitting end of 
an audio transmission system. 

However in the same field of music and speech discrimination Benyassine 
teaches classifying the audio signals at a transmitting end of an audio transmission 
system (Figure 1, encoder 112, part of transmission side, may contain a music classifier
with voice activity detector; column 3, line 62).

Therefore it would have been obvious to one of ordinary skill in the art at the time
of the invention to classify the music or voice at the transmitting side of the system as
taught by Benyassine, so that the properties of the signal can be determined and the
signal encoded in the manner best suited for transmission (Benyassine; column 1 line 62 -
column 2 line 13).

24. Consider claim 15, Saunders teaches the method according to claim 1, but does
not specifically teach further comprising: 

converting the audio signal from an analog signal to a digital signal; 

encoding the audio signal; 

packetizing the audio signal; 

transmitting the audio signal; 

decoding the audio signal; and 

processing the audio signal, wherein processing at least comprises one of storing 
the audio signal and playing the audio signal. 

However in the same field of music and speech discrimination Benyassine 
teaches converting the audio signal from an analog signal to a digital signal (figure 1,
A/D converter 108);

encoding the audio signal (figure 1, encoder 112);

packetizing the audio signal (communication devices 102 and 106 may be
cellular telephones, radios, or VoIP systems; column 3, lines 6-11. Cell phones and VoIP
systems both use packetized data);

transmitting the audio signal (figure 1, signals are transmitted over 
communication medium 104); 

decoding the audio signal (using decoder 114, figure 1); and

processing the audio signal, wherein processing at least comprises one of storing 
the audio signal and playing the audio signal (output of system is synthesized speech 
signal 120, figure 1). 

Therefore it would have been obvious to one of ordinary skill in the art at the time
of the invention to use the transmission scheme of Benyassine with the audio
classification method of Saunders in order to provide an efficient way to
transmit audio signals (Benyassine; column 1 line 62 - column 2 line 13).

25. Consider claim 20, Saunders teaches the apparatus according to claim 16, but 
does not specifically teach further comprising at least one of an audio signal encoder 
and an audio signal decoder. 

However in the same field of music and speech discrimination, Benyassine 
teaches at least one of an audio signal encoder (figure 1, encoder 112) and an audio
signal decoder (figure 1, decoder 114). 

Therefore it would have been obvious to one of ordinary skill in the art at the time
of the invention to use the transmission scheme of Benyassine with the audio
classification method of Saunders in order to provide an efficient way to
transmit audio signals (Benyassine; column 1 line 62 - column 2 line 13).

Consider claim 21, Benyassine teaches the apparatus according to claim 20, 
further comprising a speech/music classifying device being associated with the audio 
signal encoder (Figure 1, encoder 112, part of transmission side, may contain a music
classifier with voice activity detector; column 3, line 62).

26. Consider claim 22, Saunders teaches the apparatus according to claim 20, 
further comprising a speech/music classifying device being associated with the audio 
signal decoder (this is a technique for discriminating speech from music from an FM 
broadcast; page 993, column 1, line 2. An FM signal must be decoded before it can be
classified or played or manipulated in any way).

27. Consider claim 23, Saunders teaches the apparatus according to claim 20, 
further comprising a signal processor and an audio processing unit associated with the 
audio signal decoder (The experimental setup used a Gradient A/D unit attached to a
workstation; page 995, column 1, line 38. Using data processed on the fly and tuning
the radio dial at will, the classification accuracy averaged between 95 and 96%; page
995, column 1, line 43. This is a technique for discriminating speech from music from
an FM broadcast; page 993, column 1, line 2. An FM signal must be decoded before it
can be classified or played or manipulated in any way).

28. Consider claim 24, Benyassine teaches the apparatus according to claim 20,
further comprising a bitstream multiplexer associated with the audio signal decoder
(communication devices 102 and 106 may be cellular telephones, radios, or VoIP
systems; column 3, lines 6-11. Cell phones and VoIP systems both use packetized
data. It is inherent that some kind of multiplexing must be employed in order to
packetize the data).

29. Claims 5, 6, 8, 18 and 19 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Saunders in view of Benyassine as applied to claims 20 and above, 
and further in view of Pohlmann (Principles of Digital Audio). 

30. Consider claim 8, Saunders teaches the method according to claim 1, further
comprising:

selecting a number of transmitted audio signal components for analysis (The first
step is to measure the ZCR of the signal over a 2.4 second segment of the data; page
994, column 2, line 43).

However Saunders does not specifically teach transmitting components of the 
audio signal having a frequency less than a predetermined frequency. 

In the same field of audio analysis, Benyassine teaches transmitting an audio
signal using encoder 112 of figure 1 that samples at a rate of 8000 Hz; column 3, line 60.

Therefore it would have been obvious to combine the sampling of audio for
transmitting of Benyassine with the classification system of Saunders in order to allow
the transmission of digital signals.

This combination does not specifically teach that the audio being transmitted is limited
to frequencies less than a predetermined frequency.

In the same field of audio encoding, Pohlmann teaches that sampled audio must 
be passed through a low pass filter at the Nyquist frequency in order to prevent 
distortion called aliasing; page 30, prevention section. 

Therefore it would have been obvious to combine the sampling of Benyassine 
with the filtering of Pohlmann in order to prevent aliasing, and to provide a way to 
digitize the audio signal for analysis, coding and transmission. 
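
By way of illustration only, the arithmetic behind the aliasing rationale can be sketched as follows. The 8000 Hz sampling rate is the rate cited from Benyassine above; the 5000 Hz tone and the function names are hypothetical and appear in none of the references. The sketch relies only on the standard relationship that the Nyquist frequency is half the sampling rate and that a component above it folds back below it when sampled without filtering.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency representable without aliasing at the given sampling rate."""
    return sample_rate_hz / 2


def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone appears after sampling (folding about Nyquist)."""
    remainder = tone_hz % sample_rate_hz
    return min(remainder, sample_rate_hz - remainder)


if __name__ == "__main__":
    fs = 8000    # sampling rate cited from Benyassine
    tone = 5000  # hypothetical tone above the Nyquist frequency
    print(nyquist_frequency(fs))         # 4000.0
    print(apparent_frequency(tone, fs))  # 3000
```

Under these assumptions, a 5000 Hz component would be indistinguishable from a 3000 Hz component after sampling, which is why a low pass filter with a cutoff at or below 4000 Hz would precede the sampler.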

31. Consider claim 5, Pohlmann teaches the method according to claim 8, wherein
transmitting components of the audio signal having a frequency less than a
predetermined frequency comprises passing the audio signal through a low pass filter,
the low pass filter being adapted to permit transmission of frequencies below the
predetermined frequency (sampled audio must be passed through a low pass filter at
the Nyquist frequency in order to prevent distortion called aliasing; page 30, prevention
section. This is made necessary by the sampling of Benyassine; column 3, line 60).

32. Consider claim 6, Pohlmann teaches the method according to claim 1, wherein
selecting a number of transmitted audio signal components for analysis comprises
passing transmitted digital audio components through a decimator, wherein every 1 in
N audio signal components is transmitted and audio signal components between 1 and
N are discarded (This is nothing more than resampling the audio signal. As noted many
different sampling rates are used, devices cannot be connected when their sampling
rates differ... For example a 44.1kHz signal can be converted to 44.056kHz by
removing one sample every 23ms; page 460, first full paragraph. This could be carried to
the extreme of reducing the sampling rate more drastically, such as converting from
44kHz to 22kHz by dropping every other sample).
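
For illustration only, the 1-in-N selection described above may be sketched as follows. The function name and the sample data are hypothetical and are not drawn from Pohlmann or from the application. The sketch shows only the discarding step and assumes that any low pass filtering needed to avoid aliasing has already been applied.

```python
def decimate(samples, n):
    """Keep every n-th sample, starting with the first, and discard the rest."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return samples[::n]


if __name__ == "__main__":
    one_second_at_44k = list(range(44000))  # hypothetical one second of audio at 44 kHz
    halved = decimate(one_second_at_44k, 2)
    print(len(halved))                      # 22000 samples, i.e. a 22 kHz rate
    print(decimate([10, 11, 12, 13, 14, 15], 3))
```

With N = 2 this corresponds to the 44 kHz to 22 kHz example given above, in which every other sample is dropped.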

33. Consider claim 18, Saunders teaches the apparatus according to claim 16, but 
does not specifically teach further comprising: 

a low pass filter for preventing transmission of components of the audio signal 
having a frequency greater than a predetermined frequency; and 

a decimator for selecting a reduced number of audio components for analysis. 

In the same field of audio analysis, Benyassine teaches transmitting an audio
signal using encoder 112 of figure 1 that samples at a rate of 8000 Hz; column 3, line 60.

Therefore it would have been obvious to combine the sampling of audio for 
transmitting of Benyassine with the classification system of Saunders in order to allow 
the transmissions of digital signals. 

This combination does not specifically teach that the audio being transmitted is limited
to frequencies less than a predetermined frequency, nor does it teach the use of a decimator.

In the same field of audio encoding, Pohlmann teaches that sampled audio must
be passed through a low pass filter at the Nyquist frequency in order to prevent 
distortion called aliasing; page 30, prevention section. 

Therefore it would have been obvious to combine the sampling of Benyassine 
with the filtering of Pohlmann in order to prevent aliasing, and to provide a way to 
digitize the audio signal for analysis, coding and transmission. 

This combination does not specifically teach a decimator. However, later in the book,
Pohlmann teaches a decimator for selecting a reduced number of audio components for
analysis. This is nothing more than resampling the audio signal (As noted many different
sampling rates are used, devices cannot be connected when their sampling rates
differ... For example a 44.1kHz signal can be converted to 44.056kHz by removing one
sample every 23ms; page 460, first full paragraph).

Therefore it would have been obvious to one of ordinary skill in the art to include 
the decimating as taught by Pohlmann with the system of Saunders and Benyassine in 
order to provide a method for being able to connect different devices with different 
sampling rates (Pohlmann page 460, first full paragraph). 

34. Consider claim 19, Pohlmann teaches the apparatus according to claim 18,
wherein the decimator selecting a reduced number of audio components for analysis
comprises the decimator selecting every 1 in N audio signal components to be
transmitted and selecting the audio signal components between 1 and N to be
discarded (This is nothing more than resampling the audio signal. As noted many
different sampling rates are used, devices cannot be connected when their sampling
rates differ... For example a 44.1kHz signal can be converted to 44.056kHz by
removing one sample every 23ms; page 460, first full paragraph. This could be carried
to the extreme of reducing the sampling rate more drastically, such as converting from
44kHz to 22kHz by dropping every other sample).

35. Claim 12 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
Saunders. Saunders teaches the method according to claim 1, but does not specifically
teach wherein the threshold value used in the comparison is pre-determined and pre-set 
by a user. 

However Saunders does teach Data was collected manually by listening, 
collecting and storing features, and labeling the segment. A variety of content was 
processed, including talk, commercials, and many types of music. Once the classifier 
was trained, the parameters were stored and fed into the real-time feature 
extraction/classifier routine; page 995, column 1, line 33.

Because the data is collected manually, it must also be entered manually. Although the
data entered is not specifically the threshold, one of ordinary skill in the art would
recognize that training the classifier with manually collected data changes the
threshold. In effect, therefore, the threshold value is pre-set and determined by the user.

Conclusion



Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Douglas C. Godbold whose telephone number is (571)
270-1451. The examiner can normally be reached Monday-Thursday, 7:00am-4:30pm,
and Friday, 7:00am-3:30pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Patrick Edouard, can be reached at (571) 272-7603. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

PATRICK N. EDOUARD
SUPERVISORY PATENT EXAMINER 



